Studying text coherence in Czech – a corpus-based analysis
Similar resources
Corpus based coreference resolution for Farsi text
"Coreference resolution" or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...
Czech Text Document Corpus v 2.0
This paper introduces “Czech Text Document Corpus v 2.0”, a collection of text documents for automatic document classification in the Czech language. It is composed of 11,955 text documents provided by the Czech News Agency and is freely available for research purposes at http://home.zcu.cz/~pkral/sw/. This corpus was created in order to facilitate a straightforward comparison of the document clas...
Representing discourse coherence: A corpus-based analysis
We present a set of discourse structure relations that are easy to code, and develop criteria for an appropriate data structure for representing these relations. Discourse structure here refers to informational relations that hold between sentences in a discourse (cf. Hobbs, 1985). We evaluated whether trees are a descriptively adequate data structure for representing coherence. Trees are widel...
Studying Properties of Czech Complex Sentences from an Annotated Corpus
The paper deals with the problem of an analysis of complex sentences in Czech on the basis of manually annotated data. The availability of a specialized corpus explicitly describing mutual relationships between segments and clauses in Czech complex sentences, together with the availability of a thoroughly syntactically annotated corpus, the Prague Dependency Treebank, provide a solid background...
Journal
Journal title: Topics in Linguistics
Year: 2017
ISSN: 2199-6504, 1337-7590
DOI: 10.1515/topling-2017-0009